Response to Amendment



This communication is in response to the Amendment filed 7/24/2006. Claims 1-12, 14-37,
and 39-42 are pending in the application. Claims 1-12, 14-37, and 39-42 were amended.



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

1. Claims 1-3,6-12,14-18,35-37,39 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Rubin et al ("Rubin", US 2002/0099552) in view of Heck et al ("Heck",
"A Survey of Web Annotation Systems").

As per independent claim 1, Rubin discloses an apparatus for direct annotation
of objects, the apparatus comprising: a display device for displaying one or more
images (Figure 1 item 107); an audio input device for receiving an audio signal (Figure 1
item 118); a storage device for storing a plurality of notations ([0032]); and a direct
annotation creation module coupled to receive an input audio signal from the audio
input device and to receive a reference to a location within an image from the display
device (Figure 4, [0048] lines 1-6, [0066] lines 1-3), the direct annotation creation
module, in response to receiving the input audio signal and the reference to the
location within the image (Figure 4), automatically creating an annotation object, 
independent from the image, that associates the input audio signal with the location 
(Figure 4). Rubin fails to distinctly point out having a plurality of different visual 
notations. However, Heck teaches a plurality of different visual notations (Page 2 lines 
25-28). Therefore it would have been obvious to an artisan at the time of the invention 
to combine the different notations of Heck with the apparatus of Rubin. Motivation to do 
so would have been to provide a distinguishable mark to decipher between annotations. 

As per claim 2, which is dependent on claim 1, Rubin-Heck discloses the 
apparatus of claim 1 further comprising an annotation display module coupled to the 
direct annotation creation module, the annotation display module generating symbols or 
text representing the annotation objects (Rubin, Figure 4 item 401). 

As per claim 3, which is dependent on claim 1, Rubin-Heck discloses an
annotation audio output module coupled to the direct annotation creation module, the 
annotation audio output module generating audio output in response to user selection of 
an annotation symbol representing an annotation object (Rubin, [0102] lines 1-25, 
[0132] lines 1-9).

As per claim 6, which is dependent on claim 1, Rubin-Heck discloses the
apparatus of claim 1 further comprising a media object cache for storing media and 
annotation objects ([0046] -[0048]). 

As per independent claim 7, Rubin discloses an apparatus for use with a system 
for storing, accessing, and presenting objects such as video objects, text objects, audio 
objects, or image objects, direct annotation of objects, the apparatus comprising: a 
direct annotation creation module coupled to receive an input audio signal and 
reference to a location within an image (Figure 4, [0048] lines 1-6, [0066] lines 1-3), the
direct annotation creation module, in response to receiving the input audio signal or the 
reference to the location within the image, automatically creating an annotation object, 
independent of the image, that associates a symbol or text with the location ([0123]- 
[0132]). Rubin fails to distinctly point out having selected a plurality of different visual 
notations. However, Heck teaches a plurality of different visual notations (Page 2 lines 
25-28). Therefore it would have been obvious to an artisan at the time of the invention 
to combine the different notations of Heck with the apparatus of Rubin. Motivation to do 
so would have been to provide a distinguishable mark to decipher between annotations. 

As per independent claim 8, Rubin discloses an apparatus for direct annotation 
of objects for use with a system for storing, accessing, and presenting objects such as 
video objects, text objects, audio objects, or image objects, the apparatus comprising: a 
direct annotation creation module coupled to receive an input audio signal and a
reference to a location within an image (Figure 4, [0048] lines 1-6, [0066] lines 1-3), the
direct annotation creation module, in response to receiving the input audio signal or the 
reference to the location within the image, automatically creating an annotation object, 
independent of the image (Figure 4, [0123] - [0132]), that associates the input audio
signal and the location, the annotation object including at least an audio input field, an 
image reference field, and an annotation location field ([0046] - [0049]), the annotation 
audio output module coupled to the direct annotation creation module, the annotation 
audio output module generating audio output in response to user selection of an 
annotation symbol representing the annotation object ([0102] lines 1-25, [0132] lines 1- 
9). Rubin fails to distinctly point out having selected a plurality of different visual 
notations. However, Heck teaches a plurality of different visual notations (Page 2 lines 
25-28). Therefore it would have been obvious to an artisan at the time of the invention 
to combine the different notations of Heck with the apparatus of Rubin. Motivation to do 
so would have been to provide a distinguishable mark to decipher between annotations. 

As per independent claim 9, Rubin discloses an apparatus for direct annotation 
of objects, the apparatus comprising: a media object storage for storing media and
annotation objects ([0046] -[0048]); and a direct annotation creation module coupled to
receive an input audio signal and a reference to a location within an image (Figure 4,
[0048] lines 1-6, [0066] lines 1-3), the direct annotation creation module in response to
receiving the input audio signal or the reference to the location within the image
automatically creating an annotation object, independent of the image (Figure 4, [0123]
- [0132]), that associates the input audio signal and the location, and the direct
annotation creation module storing the audio annotation in the media object storage 
([0046] -[0048]). Rubin fails to distinctly point out having selected a plurality of different 
visual notations. However, Heck teaches a plurality of different visual notations (Page 2 
lines 25-28). Therefore it would have been obvious to an artisan at the time of the 
invention to combine the different notations of Heck with the apparatus of Rubin. 
Motivation to do so would have been to provide a distinguishable mark to decipher 
between annotations. 

As per independent claim 10, Rubin discloses a computer implemented method 
for direct annotation of objects, the method comprising the steps of: displaying an image 
(Figure 4, [0028]); receiving audio input ([0129]); detecting selection of a location within 
the image ([0127]-[0128]); and creating an annotation object, independent of the 
selected image, between the selected location and the audio input ([0128]), the 
annotation object including at least an audio input field, an image reference field, and an 
annotation location field ([0046]-[0049]), the creating step occurring automatically in 
response to the receiving or detecting steps ([0129]), and including automatically 
terminating a recording of the audio input based on a predetermined audio level ([0080] 
lines 1-17). Rubin fails to distinctly point out having selected a plurality of different visual 
notations. However, Heck teaches a plurality of different visual notations (Page 2 lines 
25-28) and a text annotation field (Page 2 lines 25-34). Therefore it would have been 
obvious to an artisan at the time of the invention to combine the different notations of
Heck with the apparatus of Rubin. Motivation to do so would have been to provide a
distinguishable mark to decipher between annotations. 

As per claim 11, which is dependent on claim 10, Rubin-Heck discloses a 
method where the step of displaying is performed before or simultaneously with the step 
of receiving (Rubin, [0127] -[0129]). 

As per claim 12, which is dependent on claim 10, Rubin-Heck discloses a 
method wherein the step of receiving is performed before or simultaneously with the 
step of displaying (Rubin, [0101]).

As per claim 14, which is dependent on claim 10, Rubin-Heck teaches a method 
further comprising the step of displaying the one of the plurality of different visual 
notations to indicate that the image has an annotation (Rubin, Figure 4 item 401).

As per claim 15, which is dependent on claim 14, Rubin-Heck teaches one of the 
plurality of different visual notations being text or a symbol (Rubin, Figure 4 item 401). 

As per claim 16, which is dependent on claim 10, Rubin-Heck discloses a 
method wherein the step of creating an annotation includes creating an annotation 
object and storing the annotation object in an object storage (Rubin, [0046]-[0049]). 

As per claim 17, which is dependent on claim 10, Rubin-Heck discloses a 
method further comprising the step of recording the audio input received ([0041] lines 1- 
23). 

As per claim 18, which is dependent on claim 17, Rubin-Heck discloses a 
method wherein the step of creating an annotation includes creating an annotation
object and storing the recorded audio input as part of the annotation object (Rubin,
[0046]-[0049]). 



As per independent claim 35, Rubin discloses a computer implemented method 
for displaying objects with annotations, the method comprising the steps of: retrieving 
an image (Figure 4); displaying the image with a visual notation that an annotation 
exists (Figure 4 item 401); receiving user selection of the visual notation ([0086] lines 1-
15, [0102] lines 1-25); generating the annotation automatically, in response to user input
of a location within the image and an audio input, and outputting the annotation
associated with the selected visual notation ([0080] lines 1-17, [0086] lines 1-15);
determining whether the annotation includes text; retrieving a text annotation for the 
selected visual notation; and displaying the retrieved text with the image ([0086] lines 1- 
15). Rubin fails to distinctly point out having selected a plurality of different visual 
notations. However, Heck teaches a plurality of different visual notations (Page 2 lines 
25-28). Therefore it would have been obvious to an artisan at the time of the invention 
to combine the different notations of Heck with the apparatus of Rubin. Motivation to do 
so would have been to provide a distinguishable mark to decipher between annotations. 

As per claim 36, which is dependent on claim 35, Rubin-Heck discloses a 
method wherein the annotation is text and the step of outputting is displaying the text 
proximate the image that it annotates (Rubin, [0086] lines 1-15).

As per claim 37, which is dependent on claim 35, Rubin-Heck discloses a
method wherein the annotation is an audio signal and the step of outputting is playing 
the audio signal (Rubin, [0086] lines 1-15). 

As per claim 39, which is dependent on claim 35, Rubin-Heck discloses a 
method further comprising the steps of: determining whether the annotation includes an 
audio signal; retrieving an audio signal for the selected image; and wherein the step of
outputting is playing the audio signal (Rubin, [0086] lines 1-15). 



2. Claims 4,5,19-34,40-42 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Rubin et al ("Rubin", US 2002/0099552) and Heck et al ("Heck", "A
Survey of Web Annotation Systems") in view of Mitchell et al ("Mitchell", US 5,857,099).

As per claim 4, which is dependent on claim 1, Rubin-Heck fails to distinctly point 
out details of the audio voice recognition technology. However, Mitchell discloses the 
apparatus of claim 1 further comprising: an audio vocabulary storage for storing a 
plurality of audio signals and corresponding text strings (Mitchell, Figure 2 item 20); an 
audio vocabulary comparison module coupled to the audio input device (Mitchell,
Column 5 lines 25-65), the audio vocabulary storage and the direct annotation creation 
module, the audio vocabulary comparison module receiving audio input and finding a
corresponding text string that matches the audio input (Column 5 lines 25-65); and
wherein the direct annotation creation module uses text strings found by the audio 
vocabulary comparison module to create the audio annotation (Column 5 lines 25-65). 
Therefore it would have been obvious to an artisan at the time of the invention to 
combine the vocabulary-based audio conversion of Mitchell with the system of Rubin- 
Heck. Motivation to do so would have been to provide a convenient way to convert 
voice to text (Rubin, [0097] lines 1-9). 

As per claim 5, which is dependent on claim 1, Rubin-Heck-Mitchell discloses the
apparatus further comprising: an audio vocabulary storage for storing a plurality of audio 
signals and corresponding text strings (Mitchell, Figure 1 item 20); a dynamic 
vocabulary updating module coupled to the audio vocabulary storage and the audio 
input device (Mitchell, Column 5 lines 25-65), the dynamic vocabulary updating module 
for displaying an interface to create a new entry in the audio vocabulary storage 
(Mitchell, Column 5 lines 25-65), the dynamic vocabulary updating module receiving an 
audio input and a text string and creating the new entry in the audio vocabulary storage 
that includes a new visual annotation (Mitchell, Column 5 lines 25-65, Column 9 lines 
45-56). 

As per claim 19, which is dependent on claim 10, Rubin-Heck-Mitchell discloses 
a method, further comprising the step of comparing the audio input to a vocabulary to 
produce text (Mitchell, Column 5 lines 25-65).

As per claim 20, which is dependent on claim 19, Rubin-Heck-Mitchell discloses
a method, wherein the step of creating an annotation includes creating an annotation 
object and storing the text as part of the annotation object (Rubin, [0097] lines 1-9). 

As per claim 21, which is dependent on claim 10, Rubin-Heck-Mitchell discloses 
a method further comprising the steps of comparing the audio input to a vocabulary 
(Mitchell, Column 5 lines 25-65); determining if the audio input has a matching entry in 
the vocabulary (Mitchell, Column 5 lines 25-65); and storing the entry as part of the 
annotation object if the audio input has a matching entry in the vocabulary (Rubin, 
[0097] lines 1-9). 

As per claim 22, which is dependent on claim 21, Rubin-Heck-Mitchell discloses 
a method further comprising the steps of: determining if the audio input has a close 
match in the vocabulary; displaying the close matches; receiving input selecting a close 
match (Mitchell, Column 9 lines 45-56); and storing the selected close match as part of 
the annotation object if the audio input has a close match in the vocabulary (Mitchell, 
Column 9 lines 45-65). 

As per claim 23, which is dependent on claim 22, Rubin-Heck-Mitchell discloses 
the method, further comprising the step of displaying a message that the image has not 
been annotated if there is neither a matching entry in the vocabulary nor a close match 
in the vocabulary (Mitchell, Column 9 lines 45-56). 

As per claim 24, which is dependent on claim 22, Rubin-Heck-Mitchell discloses 
a method, further comprising the following steps if there is neither a matching entry in 
the vocabulary nor a close match in the vocabulary: receiving text input corresponding
to the audio input; updating the vocabulary with a new entry including the audio input
and the text input (Mitchell, Column 9 lines 45-56); and wherein the received text is
stored as part of the annotation object (Rubin, [0097] lines 1-9).

As per claim 25, which is dependent on claim 10, Rubin-Heck-Mitchell discloses 
a method, further comprising the steps of: receiving text input corresponding to the 
audio input (Mitchell, Column 9 lines 45-56); updating the vocabulary with a new entry 
including the audio input and the text input (Rubin, [0097] lines 1-9). 

As per claim 26, Rubin-Heck-Mitchell discloses a method for direct annotation of 
objects, the method comprising the steps of: displaying an image (Rubin, Figure 4); 
receiving audio input (Rubin, [0041] lines 1-23); detecting selection of a location within 
the image (Rubin, [0086] -[0087]); comparing the audio input to a vocabulary to 
produce text (Mitchell, Column 5 lines 25-65); and creating an annotation between the 
selected location, one of a plurality of different visual notations (Heck, page 2 lines 25-
34) and the text (Rubin, [0097] lines 1-9), the annotation object including at least an 
audio input field, an image reference field, and an annotation location field ([0046]-
[0049]), the creating step occurring automatically in response to the receiving or 
detecting steps (Rubin, [0129]). 

As per claim 27, which is dependent on claim 26, Rubin-Heck-Mitchell discloses 
the method further comprising the step of recording the audio input received (Rubin, [0041] lines 1-
23). 

As per claim 28, which is dependent on claim 27, Rubin-Heck-Mitchell discloses 
the method, wherein the step of creating an annotation includes creating an annotation
object including a reference to the selected image, the recorded audio input and the text
(Rubin, [0086] -[0087]), and storing the annotation object in an object storage (Rubin, 
[0046] -[0049]). 

As per claim 29, which is dependent on claim 26, Rubin-Heck-Mitchell discloses 
the method, wherein the step of creating an annotation includes creating an annotation 
object and storing the text as part of the annotation object (Rubin, [0086] -[0087]). 

As per claim 29, which is dependent on claim 26, Rubin-Heck-Mitchell discloses 
a method further comprising the steps of determining if the audio input has a matching 
entry in the vocabulary (Mitchell, Column 5 lines 25-65); and storing the entry as part of 
the annotation object if the audio input has a matching entry in the vocabulary (Rubin, 
[0046] -[0049]). 

As per claim 30, which is dependent on claim 29, Rubin-Heck-Mitchell discloses 
a method, further comprising the steps of: determining if the audio input has a close 
match in the vocabulary; displaying the close matches; receiving input selecting a close 
match (Mitchell, Column 9 lines 45-65); and storing the selected close match as part of 
the annotation object if the audio input has a close match in the vocabulary (Mitchell, 
Column 9 lines 45-65). 

As per claim 31, which is dependent on claim 30, Rubin-Heck-Mitchell discloses 
the method, further comprising the step of displaying a message that the image has not 
been annotated if there is neither a matching entry in the vocabulary nor a close match 
in the vocabulary (Mitchell, Column 9 lines 45-56).

As per claim 32, which is dependent on claim 30, Rubin-Heck-Mitchell discloses
a method, further comprising the following steps if there is neither a matching entry in 
the vocabulary nor a close match in the vocabulary: receiving text input corresponding 
to the audio input (Mitchell, Column 9 lines 45-56); updating the vocabulary with a new 
entry including the audio input and the text input (Mitchell, Column 5 lines 25-65, 
Column 9 lines 45-65); and wherein the received text is stored as part of the annotation 
object (Rubin, [0046] -[0049]). 

As per claim 33, which is dependent on claim 26, Rubin-Heck-Mitchell discloses 
a method, further comprising the steps of: receiving text input corresponding to the 
audio input (Mitchell, Column 9 lines 45-56); updating the vocabulary with a new entry 
including the audio input and the text input (Mitchell, Column 5 lines 25-65, Column 9 
lines 45-65). 

As per independent claim 40, Rubin-Heck-Mitchell discloses a method for 
retrieving images, the method comprising the steps of: receiving audio input (Rubin, 
[0096] -[0099]); determining annotation objects that reference a close match to the 
audio input (Rubin, [0096] -[0099]); automatically creating an annotation object,
independent of the image (Figure 4, [0123] - [0132]), that associates the input audio 
signal and the location, and the direct annotation creation module automatically 
terminating a recording of the input audio signal based on a predetermined audio level 
([0080] lines 1-17); retrieving the images that are referenced by the determined 
annotation objects (Rubin, [0096] -[0099]); and displaying the retrieved images (Rubin, 
[0096] -[0099]), one of a plurality of different visual notations for the annotation object
(Heck, Page 2 lines 25-34), and wherein the annotation object includes at least an audio
input field, an image reference field, and annotation location field (Rubin, [0046]-[0049]).

As per claim 41, which is dependent on claim 40, Rubin-Heck-Mitchell discloses
a method wherein the step of determining annotation objects further comprises the
steps of: comparing the audio input to an audio signal referenced by an annotation object
(Column 5 lines 30-35); and determining a close match between the audio input and the
audio signal referenced by an annotation object if a probability metric is greater than a
threshold (Column 5 lines 35-38). Rubin-Heck-Mitchell fails to disclose a threshold of 
80%. However, Official Notice is taken that a threshold of 80% is well known in the art;
80% is not a definitive threshold and could be replaced by any other value. Therefore it
would have been obvious to combine the method of Rubin-Heck-Mitchell with the
current teaching. Motivation to do so would have been to provide a standard of
matching.

As per claim 42, which is dependent on claim 40, Rubin-Heck-Mitchell discloses 
a method wherein the step of determining annotation objects further comprises the
steps of: determining the annotation objects for a plurality of images; for each
annotation object, comparing the audio input to an audio signal referenced by an
annotation object (Column 5 lines 30-35); and determining a close match between the
audio input and the audio signal referenced by an annotation object if a probability metric is
greater than a threshold (Column 5 lines 35-38). Rubin-Heck-Mitchell fails to disclose
a threshold of 80%. However, Official Notice is taken that a threshold of 80% is well
known in the art; 80% is not a definitive threshold and could be replaced by any other
value. Therefore it would have been obvious to combine the method of Rubin-Heck-
Mitchell with the current teaching. Motivation to do so would have been to provide a
standard of matching.



Response to Arguments 

Applicant's arguments with respect to claims 1-12,14-37, and 39-42 have been 
considered but are moot in view of the new ground(s) of rejection. 



Conclusion 

Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not
mailed until after the end of the THREE-MONTH shortened statutory period, then the
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 



Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Ryan F Pitaro whose telephone number is
571-272-4071. The examiner can normally be reached on 7:00am - 4:30pm M-Th, and
alternating F.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Kristine Kincaid can be reached on 571-272-4063. The fax phone number 
for the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 

Ryan Pitaro